nFuse: Discovery of complex genomic rearrangements in cancer using high-throughput sequencing
Abstract
Complex genomic rearrangements (CGRs) are emerging as a new feature of cancer genomes. CGRs are characterized by multiple genomic breakpoints, and thus have the potential to affect multiple genes simultaneously, fusing some genes and interrupting others. Analysis of high-throughput whole genome shotgun sequencing (WGSS) is beginning to facilitate the discovery and characterization of CGRs, but further development of computational methods is required. We have developed an algorithmic method for identifying CGRs in WGSS data based on shortest alternating paths in breakpoint graphs. Aiming for a method with the highest possible sensitivity, we use breakpoint graphs built from all WGSS data, including sequences with ambiguous genomic origin. Since the majority of cell function is encoded by the transcriptome, we target our search to find CGRs that underlie fusion transcripts predicted from matched high-throughput cDNA sequencing (RNA-seq). We have applied our method, nFuse, to the discovery of CGRs in publicly available data from the well-studied breast cancer cell line HCC1954 and primary prostate tumour sample 963. We first establish the sensitivity and specificity of the nFuse breakpoint prediction and scoring method using breakpoints previously discovered in HCC1954. We then validate 5 out of 6 CGRs in HCC1954 and 2 out of 2 CGRs in 963. We show examples of gene fusions that would be difficult to discover using methods that do not account for the existence of CGRs, including one important event that was missed in a previous study of the HCC1954 genome. Finally, we illustrate how CGRs may be used to infer the gene expression history of a tumour.
Published by Cold Spring Harbor Laboratory Press. Downloaded from genome.cshlp.org on July 14, 2015.
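The abstract names the core technique, shortest alternating paths in breakpoint graphs, without spelling it out. The sketch below illustrates the general idea and is not the nFuse implementation. It assumes a graph whose undirected edges carry one of two labels (here called "breakpoint" and "adjacency", names chosen for illustration) and a non-negative cost, and it finds the cheapest path on which consecutive edges never share a label. The function name, the edge tuple layout, and the toy node names are all hypothetical.

```python
import heapq
from collections import defaultdict


def shortest_alternating_path(edges, source, target):
    """Cheapest path from source to target whose consecutive edges differ in kind.

    edges: iterable of (u, v, kind, cost) tuples describing undirected edges.
    Returns (total_cost, [nodes]) or None if no alternating path exists.
    Illustrative sketch only; the edge kinds and costs are assumptions.
    """
    adj = defaultdict(list)
    for u, v, kind, cost in edges:
        adj[u].append((v, kind, cost))
        adj[v].append((u, kind, cost))

    # Dijkstra over states (node, kind of the edge used to arrive there).
    start = (source, None)
    best = {start: 0}
    parent = {}
    counter = 0  # tie-breaker so the heap never has to compare states
    heap = [(0, counter, start)]
    while heap:
        dist, _, state = heapq.heappop(heap)
        if dist > best.get(state, float("inf")):
            continue  # stale heap entry
        node, last_kind = state
        if node == target and last_kind is not None:
            path = [node]
            while state in parent:
                state = parent[state]
                path.append(state[0])
            return dist, path[::-1]
        for nxt, kind, cost in adj[node]:
            if kind == last_kind:
                continue  # two edges of the same kind in a row are not allowed
            new_state = (nxt, kind)
            new_dist = dist + cost
            if new_dist < best.get(new_state, float("inf")):
                best[new_state] = new_dist
                parent[new_state] = state
                counter += 1
                heapq.heappush(heap, (new_dist, counter, new_state))
    return None


# Toy graph: the cheaper route through x3 uses two breakpoint edges in a row,
# so it is rejected in favour of the alternating route through x2.
toy_edges = [
    ("geneA_end", "x1", "breakpoint", 0),
    ("x1", "x2", "adjacency", 300),
    ("x2", "geneB_start", "breakpoint", 0),
    ("x1", "x3", "breakpoint", 0),
    ("x3", "geneB_start", "adjacency", 10),
]
result = shortest_alternating_path(toy_edges, "geneA_end", "geneB_start")
```

Tracking the kind of the last edge as part of the search state is what enforces alternation; a plain shortest-path search on the same toy graph would take the route through x3.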
Related resources
nFuse: Discovery of complex genomic rearrangements in cancer using high-throughput sequencing Supplementary Text
Supplemental nFuse pipeline overview
The nFuse method builds upon Comrad (McPherson et al., 2011b), our previous work on rearrangement detection in matched RNA-seq and WGSS. We begin this section by briefly describing Comrad, then describe significant differences between Comrad and nFuse. An overview of the nFuse pipeline is shown in Figure 1.
Discovery of Complex Genomic Rearrangements in Cancer Using High-Throughput Sequencing
Complex genomic rearrangements (CGRs) are emerging as a new feature of cancer genomes. CGRs are characterized by multiple genomic breakpoints and thus have the potential to simultaneously affect multiple genes, fusing some genes and interrupting other genes. Analysis of high-throughput whole-genome shotgun sequencing (WGSS) is beginning to facilitate the discovery and characterization of CGRs, ...
FusionAnalyser: a new graphical, event-driven tool for fusion rearrangements discovery
Gene fusions are common driver events in leukaemias and solid tumours; here we present FusionAnalyser, a tool dedicated to the identification of driver fusion rearrangements in human cancer through the analysis of paired-end high-throughput transcriptome sequencing data. We initially tested FusionAnalyser by using a set of in silico randomly generated sequencing data from 20 known human translo...
Transcriptome-guided characterization of genomic rearrangements in a breast cancer cell line.
We have identified new genomic alterations in the breast cancer cell line HCC1954, using high-throughput transcriptome sequencing. With 120 Mb of cDNA sequences, we were able to identify genomic rearrangement events leading to fusions or truncations of genes including MRE11 and NSD1, genes already implicated in oncogenesis, and 7 rearrangements involving other additional genes. This approach de...
DELLY: structural variant discovery by integrated paired-end and split-read analysis
MOTIVATION: The discovery of genomic structural variants (SVs) at high sensitivity and specificity is an essential requirement for characterizing naturally occurring variation and for understanding pathological somatic rearrangements in personal genome sequencing data. Of particular interest are integrated methods that accurately identify simple and complex rearrangements in heterogeneous sequen...
Publication date: 2012